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ABSTRACT 

Massive stellar clusters are the best available laboratories to study the mass func- 
tion of stars. Based on NTT/SofI near- infrared photometry, we have investigated the 
properties of the massive young cluster Westerlund 1. From comparison with stel- 
lar models, we derived an extinction Ak^ = 0.91 ± 0.05 mag, an age t = 4 zL 0.5 
Myr and a distance d — 4.0 ± 0.2 kpc for Westerlund 1, as well as a total mass of 
-Wwdi = 4.9llj!|'4g X IO^Mq. Using spatially dependent completeness corrections we 
performed a 2D study of the cluster's IMF and, in addition, of the stellar density pro- 
files of the cluster as a function of mass. From both IMF slope variations and stellar 
density, we find strong evidence of mass segregation. For a cluster with some 10^ stars, 
this is not expected at such a young age as the result of two-body relaxation alone. 
We also confirm previous findings on the elongation of Westerlund 1; assuming an 
elliptical density profile, we found an axis ratio of a:b ~ 3:2. Rapid mass segregation 
and elongation could be well explained as the results of subclusters merging during 
the formation of Westerlund 1 . 

Key words: open clusters and associations: individual: Westerlund 1 - 

Hertzsprung- Russell and colour-magnitude diagrams - stars: luminosity function, mass 
function - stars: evolution - stars: formation - stars: pre-main sequence 



1 INTRODUCTION 

Westerlund 1 (Wd 1) is among the most massive young clus- 
ters in the Local Group. Recent studies have rev ived interest 
in th is cluster, discovered already 50 years ago jWesterlundl 
Il96ll ). Several of these studies focus on the rich population 
of massive stars tha t are spectroscopic a lly identified as Wd 1 
members (see e.g. Clark et al. 20051: Negueruela Sz ClarkI 
l2005l : ICrowther et al.ll2006l : iNeeueruela et al.ll201oi r Among 
this population it has been possible to find Wolf-Rayet stars, 
evolved OB stars, and short-lived transitional objects like 
Luminous Blue Variables and Yellow Hypergiants. Wd 1 is 
the only case in which such a rich population of these very 
rare objects is observable. This makes Wd 1 one of the most 
important templates for understanding the evolution of very 
massive stars after they leave the main sequence. One of 
the great advantages is that the progenitor's mass of the 
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evolved massive stars can be inferred from the observable 
main-sequence turn-off. While the massive stars are bright 
enough to be observed at optical wavelengths, observations 
of the intermediate- and low-mass stellar population is best 
performed in the near-infrared, given t he high extinction to- 
ward s Wd 1 of ~ 10 - 12 mag (|Piatti. Bica fc Clarial 
Il998h . A study of the Wd 1 po pulation below ^ SOMq 
has been recently carried out by iBrandner et al.l l|2008l ) - 
hereafter Paper I-. In this paper we present comprehensive 
analysis of the data described in Paper I. 
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Wd 1 represents probably the best template in the 
Milky Way to understand the cluster mode of star forma- 
tion that can be observed in other galaxies, like the Anten- 
nae Galaxies, where Super Star Clu sters with masses large r 
10^ Mq have been detected (see e.g. IWhitmore et al.ll2O10l ). 

In addition to the study of the intriguing formation 
scenario of such massive extragalactic clusters, Wd 1 may 
also serve as a template to understand the interplay be- 
tween evolution of massive stars and dynamical processes 
that may lead to the formation of stable, bound and re- 
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laxed globular clusters. Given its mass, Wd 1 may indeed 
be able to retain a substantial fraction of its initia l stella r 
population, even though, according to 

Im uno et all l|2006t ). 

it has probably undergone ~ 65 supernova events. These, 
in addition to stellar winds and ionizing radiation from the 
most massive stars, have dispersed the residual gas reser- 
voir of the cluster, decreasing the gravitational binding en- 
ergy of the system. If massive enough to resist disruption, 
Wd 1 will eventually turn into a closed, virialized system. 
A study of the dynamical sta t us of Wd 1 has been made 
bv iMengel fc Tacconi-GarmanI (|2009l ). The authors use the 
measured radial velocity dispersion for a group of 10 mas- 
sive stars to infer a dynamical mass of 1.5 X 10^ M0, on the 
upper end of the Wd 1 mass estimates available in the lit- 
erature. To derive this number, the authors assume virial 
equilibrium and isotropy of the stellar motions hence their 
estimate is an upper limit. The analysis of star clusters' dy- 
namical and structural parameters often assumes spherical 
symmetry. Hence, the cluster properties, like the IMF slope, 
the stellar density profiles, the stars' velocity distributions, 
are described as ID functions depending on the distance 
from the centre of the cluster. However the spherical sym- 
metry assumption may not be valid, and this is the case of 
Wd 1. Several studies have already shown that Wd 1 has 
indee d an elongated sh ape, based on X-R ay diffuse emis- 
sion uno et al]|2006l ) and stellar counts (|Brandner et al.l 
|2008| ). Therefore, an unbiased study, which does not assume 
a priori any symmetry for the geometry of Wd 1, is needed 
to properly investigate the spatial properties of the cluster. 

We focus our attention on the study of mass segre- 
gation, global and spatially varying IMF and overall clus- 
ter shape as can be derived by the study of density pro- 
files. These macroscopic properties are, in turn, related to 
the formation history of the cluster, its internal dynamical 
evolution and its global interactions with the rest of the 
Galaxy. We developed new analysis techniques to take into 
account the observational biases related to the presence of 
many very bright objects that can hamper a quantitative 
determination of both the IMF slope and the stellar den- 
sity profiles. The most important improvement compared to 
Paper I is that we drop any spherical symmetry assump- 
tion, regarding the cluster structure. Hence the complete- 
ness maps, the photometric errors and the density profiles 
are all obtained in a 2D approach. In addition, new stellar 
evolutionary models are used for comparison with observa- 
tions. A probabilistic approach is developed to determine 
cluster memberships, using a nearby off-cluster image as a 
control frame for the field population. Stellar masses are 
derived using a maximum likelihood technique, taking into 
account realistic photometric errors and their correlations. 
IMF slopes are inferred using an approach which does not 
require any binning but makes use of all the information con- 
tained in each star's mass-probability-density-distribution. 
We use 2D-elliptical-generaliza t ion of the radial density pro- 
files bydi son. Fall fc FreemanI l| 19871 ) to obtain shape prop- 
erties of Wd 1 (e.g. its semimajor axis, elongation and ori- 
entation). 

The paper structure is as follows: we describe the data 
set used in Sect. [2] The technique to build completeness 
maps is introduced in Sect. [31 In Sect. 3] we use simulated 
stars to obtain photometric errors and their correlation. A 
statistical field subtraction method is introduced in Sect. [5] 
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Figure 1. SOFI Ks band image of Wd 1. Superimposed are 
Ks 50% completeness contours. The labels correspond to the Kq 
magnitudes for which completeness is 50% along the contour. 



After the description of the adopted stellar models (Sect. 
[6]), we use them and the clean Colour Magnitude Diagram 
of Wd 1 to infer its properties like extinction, age and dis- 
tance (Sect. [7)|. An approach to obtain, for each star, its 
mass-probability-distribution (given the adopted models) is 
shown in Sect. [8] where we also derive the global IMF slope 
and the variation of the IMF slope across the cluster. In 
Sect. [9] we build cluster density profiles and analyze them 
using elliptical models. We also quantify the extent of mass 
segregation. The last section deals with our conclusions. 



2 THE DATA 

The data set used, the reduction process and the photomet- 
ric analysis and calibration have been extensively described 
in Paper I; hence we will only provide a short summary here. 
NTT/Sofl J audits broad band observations of Wd 1 (cen- 
tered on RA(2000) = 16''47™03', Dec(2000) = -45°50'37") 
and of a nearby comparison field (offset by ~ 7' to the East 
and ~ 13' to the South of Wd 1), each covering an area of 
4!5 X 4^5 were retrieved from the ESO archive (PI: J.Alves). 

Dat a reduction was performed using the eclipse jitter 
routines |Deviiia3[2Qo3) . Point Spread Function (PSF) fit- 
ting photometry wa s derived usin g the IRAF implementa- 
tion of DAOPHOT l|Stetsonlll987l ). The number of objects 
detected in both J&xiAKs bands is ~ 7000 for the Wd 1 
field and « 5300 for the comparison field. Photometric zero 
points and colour terms were computed by comparison of in- 
strumental magnitudes of relatively isolated, bright sources 
with counterparts in the 2MASS Point Source Catalogue 
jSkrutskie et al.ll2006t) . 
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3 2D COMPLETENESS MAPS 

To obtain a correct cluster IMF and for the analysis of Wd 1 
density profile, it is necessary to derive appropriate incom- 
pleteness corrections. In Paper I the authors considered com- 
pleteness correction as a function of magnitude and distance 
from the centre of Wd 1. In the present work we drop the 
assumption of radial symmetry and build incompleteness 
correction maps as a function of the position on the chip 
and of the magnitude. 

The main source of incompleteness in our case is crowd- 
ing, which severely aff ects seeing limited observations (see 
lEisenhauer et al ]| 19981 . as an example). The effects of crowd- 
ing on the detection of point sources change according to two 
quantities: the average stellar density and the magnitude 
contrast between the given point source and its neighbours. 
Both of these quantities may not follow a radially symmet- 
ric or regular distribution. Very bright objects are normally 
scattered over the field in a nonuniform way. Even when 
they have a regular distribution, they still can cause sudden 
and very well localized drops in the completeness. In addi- 
tion each of them has its own brightness and causes a lack of 
detections in areas of different angular width over the chip. 
Stellar density itself does not a priori have to follow a sym- 
metric distribution; indeed the actual number of stars for a 
given position is determined by an interplay of several fac- 
tors, e.g. the intrinsic spatial distribution of stars within the 
cluster, varying extinction pattern (in the foreground, but 
also within the cluster) or changes in the fore- and back- 
ground population characteristics, for example within the 
spiral arms. For these reasons we think that an approach 
that does not assume any spatial distribution in the com- 
pleteness characteristics of an observed field is preferable, 
in contrast to integrated or averaged cluster characteris- 
tics, and is definitely recommendable when spatial proper- 
ties have to be investigated. For each photometric band we 
built a function with 3 variables: 

where \i. is the actual value of the magnitude (in the Mj 
band) and {x,y) the position at which completeness is eval- 
uated. It is then possible to associate an incompleteness 
correction to each star for each photometric band. The to- 
tal incompleteness correction for a star detected in both 
J audio's bands is the product of the single corrections in 
each band. The reason is that each of these corrections rep- 
resents the probability of detecting that given star in that 
specific band, and detections in each band are independent 
from each other. The Cj completeness maps have been ob- 
tained in several steps, which are detailed in Appendix 1X1 

A visualization of the completeness pattern for Wd 1 is 
shown in Fig.[T] We display the Ks band image of the cluster 
with superimposed 50% completeness magnitude-loci. The 
contours are labeled with the corresponding values of Kq 
magnitudes for which completeness drops to 50 %. Such con- 
tours follow the general distribution of stars, but also show 
peaks around the brightest stars, as expected; from Fig. [T] 
it is clear that radial symmetry is not a perfect assumption 
for the completeness distribution of Wd 1. 

A comparison of the completeness values between Wd 1 
frame and the off-cluster frame, for both photometric bands, 
is shown in Fig. [2l Given the spatial dependence of the com- 
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Figure 2. Comparison of the completeness values for tlie Wd 1 
stars (black dots) and the stars in the off-cluster frame (red dots) 
for J and ii's bands. The green dots represent stars in Wd 1 frame, 
with angular distance from the cluster's centre larger than 2'. 

pleteness for the Wd 1 frame's stars, for them there is not 
an unique value of the completeness at a given magnitude; 
for what concerns the off-cluster frame, we assumed spa- 
tial uniformity for the completeness, so the off-cluster frame 
stars (red dots) have unique values of the completeness as a 
function of magnitude (see Sect. IA4| |. Figure [2] shows that 
the completeness for the off-frame stars is always higher, at 
a given magnitude, than the average completeness for the 
Wd 1 frame stars. Similarly, 50% incompleteness is reached 
for the control field at ~ 1 mag fainter than the average 50% 
incompleteness for Wd 1 field. The cause of this difference 
may be found in the different degree of crowding of the two 
fields. The green dots in the figure represent stars in Wd 1 
frame located at more than 2' from the centre of the clus- 
ter, corresponding to ~ 2.3 pc at the cluster's distance of 
4 kpc (see Sect. I7.2|l . Even though these latter stars show 
- as expected - the highest completeness values for Wd 1 
frame, they still have slightly lower completeness than the 
off-cluster frame stars. This is a reason to believe that crowd- 
ing in this "peripheral" regions of the cluster frame is still 
higher than in the off-cluster frame, a hint to the presence 
of a low-mass cluster stellar population extending quite far 
away from the cluster centre. In Sect. [9] we will show evi- 
dence that the low-mass stars of Wd 1 may indeed occupy 
a region with a radius of the order or even larger than 3 pc. 



4 PHOTOMETRIC ERRORS 

As shown in Paper I, the DAOPHOT photometric errors are 
usually an underestimate of the true errors. DAOPHOT er- 
rors are connected to the residuals in the PSF fitting of the 
stellar counts. In principle this error estimate is absolutely 
correct for isolated stars only, so that the light is coming 
from the source of interest alone and only if the analytical 
PSF model chosen for PSF fitting is the correct represen- 
tation of the true PSF shape. In this ideal case the errors 
would come only from the Poisson noise in stellar counts. 
In crowded fields, however, there are additional sources of 
uncertainty. The main one is the presence of bright objects. 
Even though the light from these sources is iteratively sub- 
tracted from the frame by the PSF fitting algorithm, the 
unsubtracted noise in the wings of these objects can still 
affect the magnitude estimate of nearby faint stars. Stel- 
lar crowding itself can cause problems when the algorithm 
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Figure 3. The newly derived photometric errors as a function of 
magnitude in Jandi^s- Errors for the stars in the cluster's field 
in black, errors for the off-frame stars in red. 



has to disentangle very close sources even when they have 
similar magnitudes. We use simulated stars to estimate re- 
alistic errors as a function of magnitude and position of the 
stars. The new estimates of the photometric errors are de- 
rived from the difference between the inserted and recovered 
magnitude of the simulated stars. In addition, we examine 
the correlation between the estimated magnitude errors in 
the J and the Ks bands. The details of errors evaluation are 
given in Appendix [B] Our error estimates are shown in Fig. 
[3] as a function of magnitude. 



5 SUBTRACTION OF THE FIELD STARS 

We developed a technique for field subtraction based on a 
probabilistic approach. The technique takes into account the 
photometric errors, their correlation and the information 
about completeness. The natural space for our approach is 
an N-dimensional magnitude space. The technique is quite 
general and, as long as photometric errors in different bands 
and their correlations are evaluated, does not have to be 
limited to two bands. In the case of Wd 1 we only used J 
and Ks bands hence we will refer explicitly to them. 

In the ideal case, a cluster magnitude-magnitude dia- 
gram (MMD) would look exactly the same as in the off- 
cluster field, plus additional stars belonging to the cluster, 
possibly following a separate sequence in the diagram, along 
an isochrone. It should be possible to compute the stellar 
densities at each MMD position for both the on-cluster and 
the off-cluster frames and compare them. Regions with an 
overdensity of stars would correspond to regions occupied 
by cluster members. The difficult part in the on-off density 
comparison is to compute a proper density. Usually this is 
accomplished by gridding the CMD and by computing a 
density at each grid cell. Then, according to the numbers in 
the cluster cells and in the off-field cells, some stars are sub- 
tracted, usually by making use of Monte Carlo techniques. 
This approach has been very successful in many applica- 
tions, also in our Paper I. Anyway, any gridding or binning 
procedure always implies a loss of information. Gridding is 
usually performed using equal cells, and this does not take 
into account, for example, the fact that photometric errors 
increase with magnitude, making it less obvious to which 
cell a faint star should belong. On the bright parts of the 
CMD the grid size may instead be very large compared to 



the photometric errors. In this case the gridding would re- 
sult in combining stars that, if errors would be reliable, are 
very distant from each other - in units of their aphot - and 
then should not be considered "similar" and assigned to the 
same cell. We decided to change this approach and to cal- 
culate the density of stars locally, at each position in the 
cluster's MMD where a star is located. Then we calculated 
the density in the same point of the MMD, but for the off- 
frame population. The ratio of the two densities is a measure 
of the membership probability of the star that is in that po- 
sition in the cluster's MMD. 

According to its photometric errors, each star is not a 
single point in the MMD, but a multi-dimensional Gaussian 
cloud of probability, representing the chance of observing 
that object in that position. In our 2D case these Gaussians 
have an elliptical symmetry with semi-axis represented by 
cr J and (jjfs and a tilt in the MMD related to the correlation 
between the two magnitude errors. Since Gaussian proba- 
bility is greater than everywhere in the MMD, each star 
contributes a bit to the total density at each MMD position, 
the closest stars to that position having higher weight. Given 
a star with magnitudes (J*, A"s*) we define the density at 
its position in the MMD in the following way: 
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(1) 



where the asterisk refers to the star at whose position the 
density is evaluated, such that Cj, and Cks * are the com- 
pleteness fractions for that object, while Cj. andCxs^are the 
completeness fractions for the other stars. The density is 
calculated in both the on and the off field MMD, hence the 
index i may run respectively on the stars in one or the other 
field. The vectors and the S matrix are, respectively, the 
measured magnitudes and the covariance matrix associated 
to them: 



J-t/i 



Gj rajaKs 



(2) 



|S| is the determinant of the correlation matrix, and r is 
the Pearson's correlation coefficient of equation (|B1|) . The 
M vector is the vector of coordinates (J, Ks) over which the 
integration is actually performed. The integration is ideally 
performed in the whole (infinite) magnitude space. For ob- 
vious reasons we limit the numeric integration around each 
star to a region within ±5 cr, for each coordinate. 

Equation ([1} deserves several comments. Its meaning is 
the following: the contribution of the i-th star to the den- 
sity at the (J*,-ft's*) position is the integral of the product 
of that star's probability distribution, convolved with the 
probability distribution of the *-th star. Then the total den- 
sity in that position is the sum over all the i stars either in 
the on-field or in the off-field. The probability of each single 
star is normalized to 1, as it has to be, but it is impor- 
tant to consider the completeness factors and tt-^ for 

M = J,Ks that appear in equation ((l}. These factors ac- 
count for the missing detections in both the science and the 
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Figure 4. Upper panels: on and off field CMDs. Lower panels: Results of the subtraction process. The colour coding indicates the 
rejection probability 7?.rej = / ■ Red represents very likely members {Ti-Yei = 0) and purple very unlikely ones (TJ^ej = !)• 



control field. It is easy to understand why such correction is 
necessary. Imagine a star in the cluster field, with complete- 
ness factor 0.25; it means that if we detected that object, 
then (in an statistical sense), there are 3 other similar ob- 
jects that we were unable to detect. Now imagine that at the 
same position in the off-cluster MMD we would detect two 
objects both with completeness factors equal to 1. Neglect 
for a moment the real "cloud" shape of the stars' probability 
density distributions, and consider them ideally as points in 
the MMD. By computing densities without the completeness 
corrections, we would obtain = 1 and = 2. Hence we 
would oversubtract that star from the cluster's MMD. On 
the other hand, the completeness factor tells us that the ac- 
tual value of p°" is not 1 but 4, and then we would subtract 
that object only in 2/4 = 0.5 cases or, better said, we would 
assign to that star a 50% membership probability (see also 
below). 

Once we have both the on- and off-field densities at a 
given star's location in the MMD, we can compare them. 
The ratio TJrcj ~ / defines a rejection probability; the 
higher the contrast in the two densities - the lower TZtcj is 



- the more likely the object is a member. On the opposite 
side, when we are in a region of the MMD where no cluster 
members are present, this number approaches 1. Hence each 
detected object has its associated membership probability. 
To decide whether or not to keep it in the catalogue of mem- 
ber stars, we extracted uniform random numbers G [0, 1]. 
Then if C < 7^ rcj we discard the object, otherwise we keep it. 
This also means that in the following analysis the actual cat- 
alogues that we used may differ from one an other, because 
some stars may be sometimes excluded or included accord- 
ing to this random selection. The uncertainties related to 
this selection directly propagate into, e.g., the IMF slope 
evaluation. To account for this we used multiple catalogue 
realizations and evaluate the uncertainties in the outcoming 
results as the scatter in the results (e.g. the IMF slope, see 
l8.2l for more details). 

In Fig. H we show the CMDfQ of Wd 1 frame and of 

^ Even though all the procedure is performed in the MMD, for 
the sake of clarity we show the most commonly used CMDs, where 
the usual characteristics of a cluster population are better visible 
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the control frame, used as a reference for the field popula- 
tion, together with the results of the subtraction process. 
The colour coding in the lower panels indicates the rejec- 
tion probability, TZrcj ■ As already mentioned in Paper I, and 
as is clearly visible in the upper panels of the figure, the 
foreground and - especially - the background population in 
the two frames do not look really similar. A possible cause 
for the differences might be a different amount of extinction 
along the different lines of sight in the on and off field. This 
population differences cause an under-subtraction of stars 
in certain regions of the CMD. However, it is clear that the 
most likely members in the lower left panel (the red points) 
follow a well defined cluster sequence; nevertheless some iso- 
lated fore and background stars in the cluster's frame also 
show an artificially high membership probability. The rea- 
son is that there are no objects in the off field MMD at the 
same position. 

To avoid such artificial contamination, in addition to 
the subtraction process, we used a cr-clipping of our CMD 
(see Appendix [Cj . After finding the best fitting isochrone 
(see Sects. [6] and [7)1, we decided to keep only stars that lie 
within 3 a from it, i.e. those stars that satisfy the criterion 
I J* - Jisoc! < 3 X a(J,) audio's* - ifsisocj < 3 x (j{Ks,) for 
at least one point ( Jisoc, Ji's isoc) on the isochrone. 



Table 1. Mass, temperatures and magnitude offsets at the transi- 
tion masses for our combined isochrones. The PS-AT9 offsets are: 
mag(Pisa-AT9) - mag(Padova) at M = 4M0 while the PS-PHX 



offsets are: mag(Pisa-PHX) - mag(Pisa-AT9) at M = 2M| 



0- 



Model 



Mass 



Tcff 
[lO^K] 



Offsets [mag] 
J H Ks 



Padova M ^ 4 

PS-AT9 2 ^ M s£ 4 
PS-PHX M ^ 2 



Teff > 15.4 
5.8 ^T^ff ^15.4 
Teff ^ 5.8 



0.04 
0.01 



0.02 
0.02 



0.04 
0.01 



compared to the expected absolute precision in our photom- 
etry, which, taking into account the zero point errors is of 
the order of 0.05-0.1 mag. 



7 FUNDAMENTAL PARAMETERS OF WD 1 

Before preceding with the spatially dependent analysis, we 
derived the global, average properties of Wd 1 using the 
combined isochrones described above. 



6 THE STELLAR MODELS 

In the following analysis, we use a combination of P adova 
main sequence (MS) isochrones llMarigo et al.l 20081) and 
Pisa pre-main sequence (PMS) models ( Degl'Imiocenti et al.l 
|2008| ). Padova models are accessible on the web|3 and are al- 
ready provided in the 2MASS photometric system. For Pisa 
isochrones, we performed the conversion from the theoret- 
ical HR Diagra m to the observational 2M ASS-CMD our- 
selves. We used iBrott fc tfauschildtl (|2005l ) spectra, calcu- 
lated with the PHOENIX model atmos phere code f or the 
lowest temperature regions and iCasteUi fc Kurucj (|2003l ') 
spectra, based on ATLAS9 model atmospheres for the high- 
est temperature in the PMS isochrones (see Table [l]). As 
in Paper I, we assumed a solar chemical composition for 
Wd 1, hence both the MS and PMS models used here have 
this composition. Nevertheless, given the intrinsic differences 
in the evolutionary codes (opacity tables, EOS, heavy el- 
ements mixture) and given also the fact that the "Solar" 
composition is not exactly the same in the two sets of mod- 
els, they show some differences in the region of overlap. 
Small differences are present also between the set of PMS 
isochrones transformed with PHOENIX and ATLAS9 model 
atmospheres. We have carefully chosen the masses for the 
transition from one set of models to the other, in order to 
minimize the differences in colour between them. The colour 
differences are shown in Table[Tl together with the mass and 
temperature ranges in which we adopt each model. The Pisa- 
ATLAS9 isochrones have been shifted in order to match the 
Padova isochrones at 4Mq, and the Pisa-PHOENIX have 
been shifted to match the Pisa- ATLAS9 ones at 2 Mq . Ta- 
ble [1] shows that the offsets are quite small, specially when 
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7.1 Reddening and extinction 

For high mass stars on the MS, the near-infrared part of 
the spectrum is very well approximated by the Rayleigh- 
Jeans tail of a black body with temperature Tcff. Then, for 
masses above ~ 5 Mq , given that the SED shape is almost 
unchanged, the near-infrared, J — Ks , colours stay constant 
(and around mag). The upper main sequence (UMS) in 
such a CMD looks simply like a vertical line. Therefore it is 
possible to estimate the reddening towards Wd 1 by fitting 
the J — Ks colour of the UMS. To perform the fit, we used 
the stars for which Ks < 13.5 mag and 1.2 < J — Ks < 2.0 
mag and minimized the quantity: 



Y^\{J - Ks), ~ {J - Ksh 



where j runs over the selected stars and the isochrone colour 
is taken at the same i^s of the j-th star. The J — Ks 
colour selection reduces the contamination by stars clearly 
belonging to the foreground or background population. Once 
the J — Ks reddening has been estimated, extinction Ak^ 
is computed using an extinction law. Since, by definition, 
EjKs ~ Aj — Aks we have: 



A 



E. 



JKs 



The knowledge of the interstellar extinction law provides 
the missing Aj IAko ratio. While in Pa per I we adopted 
the widely used lRieke fc Lebofskvl l|l985l ) -hereafter RL85- 
extinc tion law, in the present work we use the much more 
recent iNishivama et all 1I2OO6I ) one -hereafter N06-. The au- 
thors make use of a large number of red-clump stars located 
in the galactic plane. These stars have intrinsically similar 
colours, hence the observed differences in colour are related 
to different amount of interstellar absorption. Red-clump 
stars describe a straight line in the (H — Ks, J — H) dia- 
gram parallel to the reddening vector. Hence the slope of 
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this line can be used to determine the A,j : Ah ■ Ak^ selec- 
tive absorption ratios. In addition to the largely improved 
statistics, as compared to the few sources available in RL85, 
the N06 selective absorption has the advantage of having be- 
ing measured using a Ks filter, while RL85 used K. Hence 
the former provides a result that is in the same photometric 
system as our data. The J-to-_ft's selective absorption ra- 
tio in the NOB case is given by Aj/Ak^ = 3.021, slightly 
higher than the Aj/Ak = 2.518 from RL85. We checked 
that the obtained Aks value actually does not depend on 
the age of the adopted isochrone. Our best fitting isochrone 
of 4 Myr (see also l7.2p provides a value of Ako, = 0.907 mag; 
if isochrones in the range 3-8 Myr are used, the scatter in 
the inferred Aks is less than 0.01 mag. To estimate the er- 
ror on the extinction value, we followed this reasoning. The 
absolute scatter in J — Ks colour of the UMS stars used for 
the reddening fitting described above is about 0.2 mag. This 
means that a reasonable estimate for the reddening fitting 
error is 0.1 mag. From this, and using the NOG reddening 
law coefficients, it follows that the error on the inferred to- 
tal extinction can be estimated as Aj4ifg = 0.05 mag. 

Given this errors and the results of Paper I, with 
Ako^ — 1.13 ± 0.03 mag, it may seem that our new find- 
ings are inconsistent with the previous ones. Nevertheless 
one always has to keep in mind two crucial sources of sys- 
tematic uncertainty in the method used and that are not 
included in the error estimates above. One is of course the 
choice of the stellar models, which may differ from one an- 
other both in the theoretical HR diagram and also in the 
transformations used to convert temperatures and luminosi- 
ties into colours and magnitudes. A difference of 0.05 mag in 
the intrinsic near-infrared colours of UMS stars is anything 
but unexpected. We compared the Padova models used in 
the present w ork with the Geneva models used in Paper I 
(jLeieune fc Sc hacrcr 2001. '). using in both cases solar metal- 
licity and an age of 4 Myr and 3.9 Myr, respectively. We 
observed differences in J — Ks intrinsic colour ranging from 
0.03 to 0.1 mag, at a given magnitude, in the mass inter- 
val from 5 to 30 Mq, used for the reddening estimate. The 
other source of systematic uncertainty, is the aforementioned 
choice of the reddening law. In Paper I we used the RL85 
law, and given that the selective absorption ratios are quite 
different between RL85 and NOB, this explains the difference 
in our previous and new results for the total extinction. 

To compare our findings with those by other authors, 
our best value cannot be directly converted into an Ay 
using only the NOG law. This law ha s indeed been obtained 
only from J-band redwards (see also iNishivama et al.ll2009l . 
for the extension of the NOG reddening law towards photo- 
metric bands redder than Ks)- Hence we use a combination 
of Aj/Aks = 3.021 ratio from NOG and the Av/Aj = 3.54G 
ratio from RL85 to obtain Av =9.7 mag. As already no- 
ticed in Paper I different authors report values of Av that 
vary in the range from 9.4 to ~ 12.0 mag, so our final value 
is included well within this range. 

Recently iNegueruela et all l|2010l ) have observed the 
presence of differential reddening across Wd 1. They report 
a range of AEyi ~ 1.4 mag. This range can be converted 
into a range of A_Ej_ffs ~ 0.51 mag using again a combina- 
tion of RL85 and NOG laws, matched at the J-band. The 
observed colour range for the UMS members in our data 
set is somewhat smaller than this and part of this spread is 
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Figure 5. One realization of the clean CMD after field subtrac- 
tion and (T-clipping. Error bars represent the average photometric 
errors per magnitude bin. The best fitting combined isochrone is 
also shown, with three colours indicating the three different parts 
of which the isochrone is composed (see Sect. |6] and Table [T]|. 
Some values of the stellar mass are also shown for illustration. 



probably due also to photometric errors and undetected bi- 
narity. Hence differential reddening across the cluster cannot 
be excluded, but Ncguorucla et al. (201011 extinction spread 
has to be regarded as an upper limit. 



7.2 Distance and age 

As illustrated in Paper I, the morphology of the PMS-MS 
transition region and of the whole PMS can be combined 
as a good age indicator for young clusters. Since extinction 
is determined independently (see 17. ip . the distance modu- 
lus, DM, and the age, r, can be determined without having 
extinction as a free parameter. Good age constraint is pro- 
vided by those stars that have just entered the MS. These 
stars are located at the base of the vertical MS and have 
14.9 <Ks< 15.1 mag and l.G <J-Ks< 1.8 mag. No clus- 
ter members are present at magnitudes immediately fainter 
than that (see lower left panel in Fig. This Zero- Age- 
Main-Sequence (ZAMS) region is very well identifiable in 
the cluster's CMD and can be used to anchor the isochrones 
position. It is worth mentioning that the determination of 
the age and DM in this paper is not done by a real fitting 
procedure, but through the conventional superposition of 
different isochrones for several values of the pair (DM, r). 
The DM and age values would be degenerate if only the 
ZAMS position would have been used for their determi- 
nation. A slightly older isochrone would have an intrinsi- 
cally fainter ZAMS point, and this could be compensated 
by a reduction of the DM. Isochrones of different ages, how- 
ever, also show different colours for the PMS branch, the 
younger, the redder. Hence, in our comparison, after try- 
ing to reproduce the ZAMS point, we also take into ac- 
count the shape of the PMS-to-MS transition region and 
the PMS colour. The uncertainty on the DM determina- 
tion can be reasonably quantified as ADM = 0.1 mag from 
the magnitude extension of the ZAMS region. The mini- 
mum age uncertainty that we can quote is instead half of 
the spacing between the different isochrones in our grid, i.e. 
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0.5 Myr. By isochrone superposition we obtain our fiducial 
values of DM — 13.0 ± 0.1 mag (corresponding to a distance 
d = 4.0 ± 0.2 kpc) and r = 4 ± 0.5 Myr. 

In Paper I we found values of DM = 12.75 ± 0.10 mag 
(d = 3.55 ± 0.17 kpc) and tpms = 3.2 Myr for the PMS 
population, while the MS stars provided weaker constraints 
on the age with tms between 3 and 5 Myr. The use of more 
recent PMS models partially reco nciles our findin g s wit h 
those of other authors. For example ICrowther" et al.l (|2006l l. 
by comparing the number of WR stars and of cool hyper- 
giants, find DM = 13.4 (d = 4.8 kpc) and r = 4 5 or 5 
Myr. From observations of H I, iKothes fc Dougherty! (l2007l l 
find a distance d — 3.9 ± 0.7 kpc. iNegueruela et al.l (|2010|), 



from a comparison o f their spectroscop i cally classified ob- 
jects with models bv lMevnet fc Maeded (|2000t l. favour val- 
ues of d > 5 kpc and r > 5 Myr. The authors point out the 
difficulties in spectral classification for several objects, the 
approximate character of the Toff scale, the uncertainties in 
Mv values and, finally, the uncertainty in stellar evolution- 
ary models for massive stars. T he values of d ~_5 kpc and 
r ~ 5 Myr are also supported bv lRitchie et al.l l|2010h . where 
the authors derive constraints on these quantities from the 
study a massive, interacting, eclipsing-binary. Clearly there 
are still difficulties in the determination of the distance and 
age for Wd 1 with different methods providing slightly dif- 
ferent values. Nevertheless with the present paper the dif- 
ferences between the values inferred using the intermediate- 
and low-mass end of the stellar population, on one side, and 
the high mass end on the other, are somehow reduced. 



8 THE IMF OF WD 1 

The comparison of observed magnitudes with isochrones al- 
lows the determination of stellar masses. The mass probabil- 
ity distribution for each star was determined by taking into 
account the magnitude errors and their correlation. The dis- 
tributions for the single stars are then combined to build the 
IMF of Wd 1. The detailed information on the completeness 
pattern across the field allows us to explore the variations 
of the IMF slope within Wd 1. In the following, we consider 
all our objects as single stars, nevertheless we are aware of 
the possible biases introduced by n eglecting the presence of 
binaries (see lMaiz Apellanij|2009h . This will be accounted 
for in an upcoming paper. 



8.1 The mass of the single stars 

Given our best- fitting isochrone (see Sect. [7)l we used a 
maximum-likelihood approach to determine the mass of the 
member stars. Again we work in the magnitude-magnitude 
space. There the probability density distribution of a star is 
characterized by its average magnitudes, by their photomet- 
ric errors and by the correlation among them. Isochrones in 
the MMD are curves parameterized by the mass value of the 
star, m. Hence the probability of a star with mass m and 
magnitudes M(m) — {J{m), Ks{m) ) to be observed at the 
fi, = (J*,ifs.) location in the MMD is: 

p(m) ^ 



27tlS,|l/2 

exp |-i [M(m) - fj^f S,"^ [M(m) - 




Figure 6. Left: Mass function for Wd 1; the dashed box indicates 
the region that is used for the fit of the IMF slope. Right: Zoomed 
version for the dashed box region; 7 = 2.44 and = 12200 are our 
best estimates of the power law coefficient and IMF normalization 
constant, respectively. Red lines correspond to the completeness- 
corrected function while the uncorrected function is shown in 
black for comparison. The blue line in the right panel is the best- 
fitting power-law. 



(see equations ([T]) and ^ for a definition of the symbols). 

Notice that J^^^p{m) dm = 1, hence p{m) represents 
a probability density distribution. With this approach, we 
can determine not only the most likely mass for each star, by 
maximizing p{m), but also the reliability of the mass value 
obtained. If a star is indeed located very far from the best-fit 
isochrone (in units of its photometric a), then its p{m) will 
be a very broad function, with a poorly determined peak. On 
the contrary if the star lies exactly on the isochrone then, 
ideally, p(m) will be a Dirac S function. 



8.2 IMF slope and total mass determination 

A standard approach to evaluate the IMF slope of a cluster 
is to build a histogram of the stellar masses and then fit a 
power law (or a log-normal distribution) to the histogram. 
It is known, however, that the value of the slope is quite sen- 
sitive to the way the binning is performed and even to the 
space in which the fitting is done, i.e. a linear or logarithmic 
space for the mass coordinate (see e.g. iMafz Apellanij[2009l . 
for an exhaustive description of the subject). These problems 
were also discussed in Paper I where we showed that the cu- 
mulative mass distribution, not requiring any binning, can 
be used to give stronger constraints on the IMF slope. Here 
we introduce an alternative method that does not require 
any binning and makes use of the fundamental information 
on the mass probability distribution, p{m), which is always 
ignored when only the best-mass values are used, even with- 
out binning. Given the pi (m) for each star in Wd 1 we define 
the observed Mass Function: 



djV(m) 
dm 



(4) 



The 

dm 



(3) 



function for the whole Wd 1 population is 
shown in Fig. [6] We used a restricted range of masses to 
determine the global slope of the IMF, 7, where = 
A X m ' with normalization constant A and 7 = 2.3 for a 
typical Salpeter or Kroupa IMF, in the mass regime above 
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0.5 Mq (|Salpetej|l95"5l : lKroupall200ll ). The lower mass limit 
for the slope fit is chosen to be rrimin = 3.5 Mq. At this 
mass we have 50% global completeness on the whole frame. 
Locally this value could be different. For example, in the very 
centre of the cluster, high incompleteness is reached at high 
values of the stellar mass (see the lower panels of Fig. [S]). 
This may cause some additional uncertainty on the derived 
IMF slope. The effects of spatially-varying-incompleteness 
are investigated in detail in Sect. 18.31 where the potential 
bias in the cluster centre is also analyzed. The upper mass 
limit for the slope fit is chosen to be mmax = 27 Mq. The 
resons for this limit are: i) the magnitude limit of our data- 
set. Stars more massive than this are above the linearity 
regime of the NTT/SofI observations that we have used, 
ii) stars above this mass are close to the turn-off region, 
according to Padova isochrones. Hence the determination 
of their initial masses starts to be age-dependent, and the 
complex post-MS evolution of such massive stars is quite 
uncertain, from the theoretical point of view, iii) the fitting 
procedure: above this mass value the numbers become so 
small that statistical fiuctuation are not negligible and could 
lead to a bad fit. 

We show the results in bi- logarithmic plots, but the ac- 
tual fit has been performed in a linear space. The global IMF 
slope we obtain is 7 = 2.44lQ Qg, slightly steeper than an or- 
dinary Salpeter/Kroupa IMF. We will explore in Sect. 18.31 
local departures from this behaviour. For the normalization 
constant we found A = 1.22j;Q j4 x 10''. The best values and 
uncertainty of 7 and .4 are evaluated by using a bootstrap 
technique, as detailed in Appendix [D] Given the couple of 
values (7i,Ai) obtained from a single boostrap sample, it 
is possible to associate to them a value of the total mass 
and total number of stars for Wd I. We extrapolate the 
power law with index 7^ in the range m[MQ] £ [0.5, 120]. 
The upper mass limit is a reasonable estimate of the highest 
stellar mass that is expected to form in a massive cluster 
as Wd 1. From Padova isochrones, we have that stars with 
initial masses larger than ~ 65 Mq are supposed to have al- 
ready undergone supernova explosions at the estimate clus- 
ter age of 4 Myr. Hence our results are estimates of the total 
initial mass and total initial number of stars for the cluster, 
under the assumption that the Present Day Mass Function 
is representative of the IMF value. For masses below 0.5 Mq 
and down to the hydrogen burning limit, i.e., 0.08 Mq we 
used the Kroupa IMF slope for this stellar regime, with a 
7 = 1.3. In Appendix |D] we also show how we derived the 
best estimates for the total number of stars and the total 
mass of the cluster, given the set of Ntot,i and Mtot,i from 
the different bootstrap samples. The total number of stars 
is Ntot = 1.04tj g3 X 10^ while the total mass of the cluster 
is estimated to be Mtot = 'iMtoH >^ IO^Mq. 

Our present findings, based on a more complete and 
thorough approach, confirm the findings of Paper I and are 
on the lower end of the recent literature estimates for the 
mass of Wd 1. Using the MS turn-off mass and the identi- 
fied post-MS member, by extrapola t ion o f a Kroupa IMF 
down to lower masses, IClark et al.l (|2005l ') found a some- 
what higher value for the total mass of ~ lO'^ Mq. Part of 
this discrepancy could be ascribed to the model-dependent 
uncertainties in the determination of the progenitor mass 
for the post-MS identified members. Additionally one has 
to be cautious when counting only the very massive stars 



to normalize the Kroupa IMF and then extrapolate it all 
the way down to low-mass stars. Only few young clusters 
in the Milky Way are known for which the IMF can be 
actually measured up to this masses, hence the nature of 
the IMF and its exact form is not known with great cer- 
tainty in this regime. Moreover, also in the case that a stan- 
dard IMF is valid for the very massive stars, high stochas- 
tic (Poissonian) fiuctuations are expected when the num- 
bers become small as towards the very high mass end of the 
W d 1 population. A completely differ ent approach was used 
bv lMengel fc Tacconi-GarmanI (|2009l ') to determine a gravi- 
tating mass of Wd 1, Mdy„ = 1.51q ? x 10^ Mq. The authors 
measured the radial velocity of ~ 10 stars from their spec- 
tra. From the dispersion of these velocity measurements, the 
total mass of the system is derived, under the hypothesis of 
virial equilibrium, using the following equation: 



Mdyn ~ 



G 



here rhp is the projected half-mass radius, a the velocity 
dispersion and r; is a factor that the authors use under 
the additional assumption of isotropy. Possible pulsations 
in the 5 yellow hypergiants (YHGs) of the sample, which 
would cause a wrong estimate of their radial velocities, may 
cause an overestimate of the true a. I Ritchie et al.l (|200g i 
demonstrate indeed that one of the YHG observed in Wd 1, 
W243, shows a very complex, time-varying spectrum with 
signs of pulsation and mass loss that may hamper a pre- 
cise dete rmination of the radial velocity. This star is not in 
the M engel fc Tacconi-Ga rman (2009) sample, but it exem- 
plifies that velocity dispersions derived from radial veloc- 
ity measurements of evolved stars can lead to an overesti- 
mate of the true dispersion. In addition to this, we think 
that part of the discrepancy in the inferred dynamical mass 
could derive from the fact that Wd 1 is actually non spher- 
ical (see Sect.|5)) and this anisotropy might be refiected also 
in the stellar motions. Therefore the rj factor used by the 
authors should be slightly modified, possibly giving better 
agreement with other findings. Indeed the velocity distribu- 
tion seems to be not isotropic from our preliminary analy- 
sis of stellar proper motion using multi-epoch near-infrared 
AO d ata (Kudryavtseva et al. in preparation). On the other 
hand iFleck et al.l l|2006l ) showed that the 77 parameter is a 
time-dependent quantity, which changes rapidly, specially 
in very rich clusters, due to the effects of mass segrega- 
ti on. The authors found that the u se of an r; ~ 10, like 
in lMengel fc Tacconi-GarmanI (|2009h . may lead to underes- 
timates of clusters masses. We will show in the following 
that Wd 1 is mass-segregated. In such a case an increase in 
ri is needed to correctly estimate its dynamical mass. This 
would lead to an even stronger discrepancy with our photo- 
metric mass estimate. A possible interpretation of this differ- 
ence could be that Wd 1 is indeed out of virial equilibrium, 
with stellar motions still not relaxed after the gas-expulsion 
phase that followed the first supernovae explosions. An ef- 
fect that could balance the effects of mass segregation on 
the 77 value is the inclusion of binaries in the estimates of 
this parameter. Binaries orbital motions increase the mea- 
sured value of the velocity dispersion; consequently the true 
mass of a cluster is overestimated if the binary contribution 
is not properly taken into account. iKouwenhoven &: de Griii 
|20o9) showed the dependecy of the rj value on binary prop- 
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erties and cluster density. For the densest clusters (A*' > 
lO'^ stars pc~^) the bulk motions dominate the total value of 
0"^ while for the sparsest clusters, with TV ~ 0.1 stars pc~^, 
the velocity dispersion is fully dominated by orbital mo- 
tions. Wd 1 density is in between these two extreme values. 
In this case the dynamical mass can be overestimated by 
10-100% , depending on the properties o f the b inary pop- 
ulation. iGieles. Sana fc Portegies Zwarj l|2010ah , including 
mass-dependent mass-to-light ratio of stars and the intrinsi- 
cally different binary properties of massive stars, found that 
the contribution to from binaries orbital motions is al- 
ready very important for young (~ 10 Myr), moderately 
massive (M ~ 10^ Mq) and compact ( rup ~ 1 pc) star 
clusters, comparable to Wd 1. 

8.3 Spatial variability of the IMF 

In Paper I we have shown that, considering concentric an- 
nuli centered on Wd 1 centre, and computing the slope of the 
IMF for the stars in the annuli, there is a tendency for a flat- 
tening of the IMF when going closer to the centre. The IMF 
slope was computed using stars more massive than 3.4Mq 
only, to avoid any bias due to the lower degree of complete- 
ness in the crowded centre of the cluster. Still, close to the 
brightest stars, 50% incompleteness is reached already at 
higher masses of up to 6Mq. This may still cause an artifi- 
cially flatter IMF in the central parts of Wd 1, because, even 
though the IMF slope is obtained using the incompleteness 
corrected number of detected stars, the correction itself be- 
comes quite uncertain when one uses it for much lower levels 
than 50% completeness. We have also shown that the com- 
pleteness pattern in Wd 1 is not really radially symmetric 
and we will show in the following Sect. |9]that the shape of 
the cluster itself is elongated; hence using concentric annuli 
can smooth out some of the intrinsic spatial variations of the 
IMF. With our new approach we determine the IMF slope 
locally, in order to follow its real pattern within Wd 1. 

To calculate the IMF slope at each position, we used a 
moving box, 200 pixels in size. The slope was obtained by 
the same technique described in Sect. 18.21 applied only to the 
stars in the box. At each position we additionally selected 
stars such that the total completeness factor Cj x Ck^ is 
always higher than some fixed threshold values. In this way 
we can compare results for varying completeness thresholds. 
Hence, at each position the minimum mass considered can be 
different. The upper mass limit is determined by the stochas- 
tic distribution of non-saturated high-mass stars within the 
moving box. The fit is performed only when the number 
of stars inside the box is larger than 30. The calculation 
is repeated at each pixel. Anyway we are forced to use a 
moving box that is much bigger than the sampling scale be- 
cause we need enough stars to perform the IMF slope fit. 
Hence the adjacent-pixel slope-values are not independent 
from each other. The final maps are obtained by convolv- 
ing the adjacent-pixel slope-values with a Gaussian kernel 
of FWHM = 100 pixels (half-box size) At the distance of 
Wd 1 (« 4kpc) and with a plate scale of O'.'29/pixel 100 
pixels corresponds to ~ 0.5 pc in linear scale. 

Results for three values of Cj x Ck^ ~ 0.125, 0.25 
and 0.375, are shown in Fig. [T] With our definitions, the 
conventional 50% threshold in one band is replaced by a 
0.5 X 0.5 — 0.25 combined threshold. The green areas cor- 



respond to an ordinary Kroupa-like slope (7 ~ 2.3). The 
yellow-red areas in the centre indicate regions with a flat- 
ter IMF, i.e., with more high- mass stars than what is pre- 
dicted by a Kroupa IMF. Blue-purple areas are areas with 
a paucity of massive stars. Hence, when fitting a power law 
we obtain a very steep function due to the overabundance 
of low-mass stars. The contours in the three panels are the 
contours for the CJ^^^^l x Cks haif function, i.e. the product of 
the J and Ks 50% completeness magnitudes. These contours 
trace the shape of the total completeness correction factor. 

The overall pattern of the IMF slope remains unchanged 
among the three different maps of Fig. [T] Nevertheless, some 
differences can be noticed. Going to lower and lower com- 
pleteness thresholds (i.e., from right to left in the figure), the 
yellow region in the centre of the image "shrinks", leaving 
space for regions of slightly steeper IMF around it. Hence, 
when completeness corrections are properly taken into ac- 
count, there are strong hints that low-mass stars are over- 
abundant outside the very centre of Wd 1. On the other 
hand, the yellow-red regions still visible in the centre of the 
maps, even for the lowest completeness threshold, indicate 
an overabundance of massive stars that is very likely to be 
intrinsic and not just a result of missing detections in the 
low-mass end. A similar "shrinking" behaviour is observed 
for the "purple" outer regions that are very pronounced in 
the two rightmost panel of the Fig.[7]and less in the leftmost. 
In this case the effect is due to the difference in the mass in- 
tervals used for the fit of the IMF slope. At higher complete- 
ness thresholds, only few mass-points are available, and the 
differences in number counts between the low-mass and high 
mass limits within the fitting interval are very high when 
few high- mass stars are present. When lowering the com- 
pleteness threshold, star counts are added at lower masses, 
hence the observed mass function becomes more "regular" 
and the results of the fit of the slope are less extreme. 

Given the low number of stars towards the high-mass 
end, statistical fluctuations in this regime may increase the 
uncertainty of the IMF slope. To see whether this effect 
is important, we compared the 7 values for all stars with 
m > miini and the 7 values for stars with mum < m < 7 Mq 
only. The results are shown in the upper panels of Fig. [8] 
The grey regions in the centre are regions where the num- 
ber of stars in the fitting interval was too low to perform a 
good fit. Excluding these regions, it is clear that the differ- 
ence between the two slopes is almost everywhere zero. This 
tells us that the fit is dominated by the low-mass regime of 
the fitting interval, where the stars are more numerous and 
where the overall shape of the IMF is very well determined 
since statistical fluctuations are less pronounced. The only 
differences between the two slopes are observed in the very 
centre, where rriura becomes very close to the upper-mass 
limit of 7 Mq. In these regions, indicated by a cyan colour, 
the inferred IMF is flatter when the high-mass end is ne- 
glected. Anyway, the incompleteness level in the very centre 
is high, hence these small differences (A7 < 0.3) cannot be 
considered significant. 

Summarizing, we can say that the overall IMF shape 
is consistent with a Salpeter or Kroupa galactic IMF in the 
range of masses between 3.5 and 27 Mq. This slope is the 
spatial average of a slope that varies across Wd 1. A trend 
in the local IMF slope values can be observed in Fig. [T] with 
central regions having flatter IMF compared to the outer 
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Figure 7. 2D maps of the IMF slope of Wd 1. Within our definition, the colour coding corresponds to values of —7. The three maps are 
built using only stars with a completeness factor, Cj X Cxg 1 down to 0.125, 0.25, 0.375 (from left to right); overplotted are the contours 
of Jhait X ^Shalf- The X and y axes correspond to RA and Dec offsets, in arcminutes, relative to the centre of the reduced image, (RA, 
Dec) = (16''47™06=, -45°50'33") 
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Figure 8. Upper panels: Differences in the 7 values using stars between the completeness-threshold mass and 7Mq and using all 
stars above the threshold mass. Lower panels: Values of the completeness-threshold mass. Columns from left to right correspond to 
Cj X Cifg = 0.125, 0.25 and 0.5 respectively. Gray areas are areas with not enough stars to perform a reliable fit of the IMF slope. The x 
and y axes meaning is the same as in Fig. [7] 
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regions of the cluster. This is a robust indication that Wd 1 
is mass segregated. We will show additional evidence of this 
mass segregation in Sect. O where we will also discuss its 
possible origins. 



9 MORPHOLOGY OF WD 1 

Sev eral recent studies indicate that Wd 1 is elongated (see 
e.g. iMuno et al.l l2006l ) . In our Paper I, assuming an el- 
liptical shape with a and 6 as semi- major and semi-minor 
axis respectively, we found an ellipticity of the cluster, 
= 1 — - = 0.19, when stars with masses in the range 10 
to 32 Mq were considered. The value slightly decreased, to 
T] = 0.15, using masses between 3.5 and 10 Mq. Elongation 
was computed by calculating the half-mass radius as a func- 
tion of the position angle (PA), considering for each PA only 
stars within ±45° around PA and, correspondingly, around 
PA -I- 180°. We also showed that the overall surface mass 
density profile of the cluster follows a E(r) oc [l + (r/a)^] 
radial law fsee lElson et al.l 19871 . hereafter EFF87), with core 
radius related to the a parameter by EFF87 equation (22), 
i.e., rc « a{2'^'l^ - 1)^/'^ and /3 = 2 for Wd 1. At large 
distances from the center, the 3D density profile goes like 
p(r) oc r-'^P-^ -see EFF87 equations (13a) and (13b). Hence 
an index /3 = 2 for the 2D density profile implies a 3D den- 
sity t hat goes like , which corresponds to a IPlummeJ 
l|l91l[ l model. K j3 = 0.5 corresponds instead to an isother- 
mal sphere with 3D density going like . The density pro- 
file of Wd 1 falls more rapidly compared to the case of R136 
cluster in the Large Magellanic Cloud. This cluster has a 
mass comparable to that of Wd 1 and slightly younger age 
of ~ 3 Myr b ut shows a profile tha t is closer to isothermal, 
with P X 0.8 l|Andersen et al.ll2009l : ICampbell et al, 2010 ) 

Our 2D incompleteness mapping enables a study of the 
cluster's 2D stellar density distribution. We calculated the 
surface number density for several mass ranges and used 
four values for the completeness threshold. Given a lower- 
mass threshold, miow, and a completeness threshold, Ctr, we 
considered all the stars above these thresholds for calculat- 
ing the stellar surface number density. The number density 
was computed using a moving box 100 pixels in siz^fl, which 
was moved pixel-by-pixel. After counting the stars at each 
position, we convolved the counts with a Gaussian kernel 
of FWHM = 50 pixels, i.e. half-box-size, to account for the 
fact that the density value computed at each pixel position is 
not independent of the values computed at nearby pixels. In 
this way we have been able to build smooth number density 
maps for Wd 1. These density profiles are always elongated, 
hence we decided to perform a fit by using an elliptical gen- 
eralization of the EFF87 profile, a natural extension of the 
work done in Paper I. We will refer to this profile as GEFF. 

The GEFF profile can be described in the following way: 



Sgeff = Sbg -f Sc (l + 



(5) 



where Sbg is a stellar background density, Ec is the density 
in the centre and F represents the density decay for large 
distances from the centre (L 3> 1). 

^ This is half of the box size used for the computation of the local 
variations of the IMF (see Sect. 1531 . Since here we just need to 
count the stars, we do not need large numbers within the box. 



The quantity L is given by: 
2 / ,\ 2 

+ 



(6) 



In analogy with the EFF87 a parameter, which is related to 
the core radius, a and b are related to the core semi-major 
and semi- minor axes of the elliptically symmetric GEFF pro- 
file. The quantities x' and y' are given by: 
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(7) 



where {xc,yc) are the pixel-coordinates of the centre of the 
ellipse and 9 is the tilt angle between the semimajor axis 
and the x axis, measured counterclockwise. 

Summarizing, a GEFF profile has 8 different parame- 
ters: P = (Sbg, Sc, F, a, 6, Xc, 2/c, &)■ In the fit we left all of 
them free, apart from the exponent F. Since the equivalent 
exponent for an EFF87 profile was found to be /3 = 2 in 
Paper I, we constrained our F to stay between 1 and 3. 

Moreover, given that the completeness correction in the 
very centre of Wd 1 may be uncertain, we performed the fit 
by neglecting the region in which the stars have, on aver- 
age, a completeness factor smaller than 0.25. An example of 
stellar density contours and the relative GEFF fit is given 
in Fig. E 

9.1 Results of the GEFF fit 

We calculated the stellar density for several combinations 
of the mass and completeness thresholds; the values used 
are: mfM©] = [2.5, 3.5, 4.5, 6., 7.5, 10., 12.5, 15.] and C = 
[0.125,0.25,0.375,0.5]. A summary of the outcome of the 
GEFF fit for all these combinations can be found in Fig. 1101 
For clarity, we emphasize that, given the value of the mass 
threshold, we consider all the stars with m > miow. Hence, 
in the plots of Fig. 1101 going from right to left along the mass 
axis, it is possible to see the cumulative effect of including 
lower and lower mass stars. The different symbols represent 
different completeness thresholds, as indicated. 

In the left panel of the figure we show the eccentricity, 

e = \J ^ ~ ( a)^' With this definition, the ellipticity values, 
ri — 0.15and0.19, of Paper I become e — 0.53 and 0.59, re- 
spectively. The eccentricity values are almost constant with 
mass. Their average values are somewhat higher than what 
found in Paper I, with e ~ 0.75 indicating an axis ratio 
a : 6 = 3 : 2. The fact that the numbers are slightly different 
compared to Paper I is not surprising. The adoption of a 
radially symmetric completeness correction in Paper I has 
partially smoothed out some of the asymmetry and intrin- 
sic elongation of the cluster. Our new results clearly reveal 
the elongated 2D density distribution of main-sequence stars 
with masses between ~ 3 and ~ 30 A/q , with higher elon- 
gation observed for lower mass stars. This might be related 
to the fact that more massive stars are also more centrally 
concentrated (see below), hence their average collision time 
is shorter than that of the less massive stars. Consequently, 
massive stars undergo more dynamical interactions and their 
momenta become more isotropic. 

For completeness values between 0.125 and 0.5 we also 
performed a least-square fit of the semi-major axis values, 
using the functional relation: 
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Figure 9. Example of surface density contours and a fit using the GEFF profile, in pixel coordinates on the frame. Lejt: stars with 
m > 7.5 M0 and C > 0.375; completeness factors for the single stars are colour-coded. Right: the corresponding density contours 
(in colours). The density is in arbitrary units and the colour-coding goes from the minimum to the maximum density. The gray area 
corresponds to the area where the average completeness for the whole cluster's population is below 0.25; this area is masked-out when 
performing the GEFF profile fit. The results of the fit are displayed as black elliptical contours. 



a^kx [log(m)]'' . (8) 

The results of the fits are shown in the central panel of 
Fig. 1101 In general the cumulative semimajor axis decreases 
almost hke l/log(m). We already found an indication of 
mass segregation by investigating the IMF spatial variations 
(see Sect. [8]). The finding here confirms that massive stars 
are more centrally concentrated. The figure also shows that 
adopting a lower completeness threshold, the actual size of 
the semi-major axis decreases, at fixed mass. The reason is 
the inclusion of more and more stars in the centre of the 
cluster, where, of course, the total completeness is lower. 
Consequently, going to lower completeness thresholds the 
stellar density has a more pronounced peak in the centre, 
while the density in the outer regions of the cluster does not 
change as much. Since a is a measure of the length-scale of 
the density decay with distance from the centre, we obtain 
lower a values when the density contrast between the centre 
and the outskirts is more pronounced. 

From Fig. 1101 right panel, it is also very interesting to 
note that Wd I's direction of elongation lies very close to 
the galactic plane. 



9.2 Possible sources of elongation 

In the following, we carry out a qualitative discussion of pos- 
sible sources of elongation. The typical orbital period for a 
star at Ipc distance from a central point source with mass 
M = 10^ M0 is about td ^ 3 x 10^ Myr. This timescale is 
much shorter than the typical half-mass relaxation time of 
about trh ~ 10* — 10® yr, as determined for a typical globular 
cluster -comparable in stellar mass to Wd 1-, and defined as 
the time required for the central h alf of the clu ster mass to 
reach equilibrium (see Sect. 1.1 in lSpitzei1ll987l ). Given the 
difference in the two timescales, it is clear that the observed 
deviation from spherical symmetry cannot be ascribed to 
the global evolution of the cluster as an isolated system; af- 
ter few orbits and encounters, the phase-space distribution 
of stars for an isolated system is expected to be isotropic (in 



v) and spherically symmetric (in r). The deviation from the 
spherical cluster shape must be explained either by unusual 
initial conditions still reflected in the present cluster appear- 
ance or by some interaction with the rest of the Galaxy. 
A net angular momentum of the giant molecular cloud form- 
ing Wd 1 or a formation of Wd 1 out of two or more subclus- 
ters might either be responsible for its elongated shape. Dif- 
ferential galactic rotation exerts a shear on molecular clouds 
which might lead t o a net angular momentum . Acco rding to 
a recent study by iBallesteros-Paredes et al] (|2009l ). galac- 
tic shear and tides have rather strong effects on initially 
elongated clouds, eventually quenching star formation and 
disrupting the clouds. Hence differential rotation is an un- 
likely source for the elongation of Wd 1. 
An other intriguing possibility would be a "hierarchical" for- 
mation scenario, with merging of two or more smaller sub- 
clusters. The existence of a non-negligible fraction of pos- 
sible binary clusters is supported by ob servations (see e.g. 
Ide la Fuente Marcos fc de la Fuente Marcos ,200§ ) . A hier- 
archical organization of the ISM and of young st ellar groups 
and c l usters is indeed observe d on many scales (jElmegreenl 
l2009l ). lNegueruela et al.l ||2010D report the presence of a sub- 
clump of massive stars in the S-E region of Wd 1, even 
though they also warn that fluctuations in the star counts 
could be responsible for this observed clump. IClark et al.l 
suggest that an age spread within the Wd 1 mas- 
sive star population is really unlikely, hence any possible 
merging or capture event must have happened in the very 
beginning of the cluster's history. Otherwise this episode 
could have happened also more recently, but under the 
condition that the subclusters are coeval, i.e. star forma- 
tion has occurred at almost the same time in different re- 
gions of the giant molecular cloud. If this scenario would 
be true, the modest amount of dynamical crossing times 
occurred from Wd 1 formation (age ~ 10 td) may be the 
reason why the stellar motion have not yet reached an 
isotropic distribution. No dynamical simulations that in- 
clude such a large number of particles as Wd 1 members 
have been performed so far. Nevertheless, recent studies 
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Figure 10. Left: Eccentricity values as a function of the minimum mass; Middle: Semi-major axis values as a function of the minimum 
mass; the quoted k and q values are obtained by fitting functions as described in equation (0; Right: Tilt angle betwreen semi-major axis 
and X axis, measured counterclockwise; each of the concentric rings helps to distinguish the different values as a function of the minimum 
mass used. Different symbols correspond to different completeness thresholds. 



suggest that merging is in deed possible over a wide range 
of initial conditions (see IPorteeies Zwart fc Ruslil I2OO7I : 



Ide la Fuente Marcos fc de la Fuente Marcos! 2010l . and ref- 
erences therein). 

We evaluate whether tidal effects in the Galactic central field 
could be responsible for the Wd 1 shape. Under the simplify- 
ing assumption of a circular restricted 3-body problem, with 
primary mass at the Galactic Centre position and secondary 
at the cluster centre, we find that tl, i.e. the distance of the 
inner Lagrangian point from Wd f centre is: 

3 Mmw 

where Mwdi = 5 x IQ'^Mq is our Wd f mass estimate, 
Mmw = 6 X lO" Mq is the mass of the Milky Way and 
Rg = 4kpc is the Wd 1 galactocentric distance. With these 
numbers, we obtain tl = 12 pc; given that Afwdi ^ A/mw 
this is also the distance for the external Lagrangian point 
from the Wd 1 centre. This estimate for the tidal radius is 
a lower limit, since it was assumed that all the mass of the 
Galaxy is confined within the orbit of Wdl. A more cor- 
rect estimate, taking into account only the enclosed galactic 
mass at radius Rg, would lead to an even larger value of 
rL- Consequently, it is clear that Wd 1 is far from filling its 
Roche lobe, while we measure elongation already on a scale 
of ~ 1 pc from the cluster core. Hence tidal distortion from 
the whole Galaxy is unlikely the reason for the elongation 
of Wd 1. 

An other important tidal effect could be caused by the galac- 
tic disk gravitational field. The disk potential is costant far 
away from the galactic midplane, where the matter density 
distribution of the disk drops to zero. On the contrary within 
the disk the potential has a non-zero divergence in the di- 
rection perpendicular to the plane (Z axis). This divergence 
causes a net acceleration of the stellar motions in the Z di- 
rection, and, as a result, an initially spherical and isotropic 
cluster moving across the disk midplane is compressed along 
the Z direction. This phenomenon is known as "compressive 



gravitational sho ck" and an ana lytical solution is presented 
in Chapter 5 by ISpitzeij l| 19871 ) . Unfortunately the condi- 
tions that are required t o ap ply this analytical treatment 
do not hold for Wd 1. In ISpi tzcr (1987) the galactic plane 
tidal field is treated as a fast time-dependent perturbation 
to the motion of stars within Globular Clusters which cross 
the midplane at a speed of hundreds of km/s. Hence the du- 
ration of the perturbation is short compared to the stellar 
orbital period around the cluster centre. However, Wd 1 is 
moving much slower in the Z direction. From a preliminary 
analysis of our AO multi-epoch observation, we can set a 
limit on the net bulk motion of < 10 km/s along the galactic 
longitude coordina te h (Kudryav tseva et al. in preparation). 
For this reason the lSpitzeil l|l987t ) analytical solution cannot 
be used here, but it could be worth to investigate the effects 
of the disk tidal field in detail with dynamical simulations. 

9.3 The effective cumulative radius 

As an alternative to study of mass segregation, in addition 
to the estimate of the GEFF best fit semi-major axis, a, 
we used an independent quantity that we call the effective 
cumulative radius: 



rcff (m) 



r C— 



y 



mi>ra \CjiXCKo 



(9) 



This quantity is obtained by taking all the stars with mass 
rui > m and computing their geometric-averaged distance 
from the centre of the cluster. The distance from the cen- 
tre for the single stars is r^; the completeness factors, d 
are needed to take into account, in a statistical sense, the 
undetected sources. 

Figure [TT] shows the quantity rofi as a function of mass. 
Looking from right to left it is evident that the inclusion of 
less and less massive stars in the computation of rcff leads 
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to an increase of the average distance from the centre of 
the cluster, meaning that more massive stars are located on 
average at smaller effective radii compared to less massive 
stars. The first few points on the right of the plot don't fol- 
low this relation. This is due to the fact that the definition 
of an average distance for the most massive stars is prob- 
lematic, given the low numbers considered. Indeed, the most 
massive star in our data set is a bit off-centre, hence the ef- 
fective radius for this star is quite large; this star also affects 
the effective radius of the first ~ f points on the right of 
the diagram, because its distance from the centre enters the 
computation of the average distance for the other stars. As 
long as the number of stars included in the computation 
of rofi increases, the results converge towards a more stable 
averaged distance. 

The smooth increase of r^s with decreasing mass con- 
firms the findings for the semi-major axis length of Sect. 19. f1 
Hence we can state that Wd f is clearly mass segregated. 



9.4 The origin of mass segregation for Wd 1 

Mass segregation is a phenomenon observed in many 
young c lusters; some examples are t he Orion Nebula 
Cluster (|Hillenbrand fc Hartmarml Il998l), the NGC 360 3 
Young Cluster (IStolte et all |2006| : iHaravama et al.1 12008|) 
the Arches Cluster jStolte et al.1 |2005; Ki m et al.n2006l : 
lEspinoza et al. 

I 120091 ') ■ and debate is still open whether 
this phenomenon is either primordial or due to dynami- 
cal evolution. If only two-body encounters are considered, 
the half-mass relaxation time for a cluster is given by 
IIBinnev fc TremaindlT987l l: 



6.5 X lOVr f M 
ln(0.4 7V) IO^Mq 



IM, 







< m > 



I pc 



where A'' is the total number of stars, < m > is the mean 
stellar mass and r\ the deprojected half-mass radius, equal 
to 4/3 of the projected half-mass radius. Considering N — 
10^, M = 5 X lO"* Mq, rh = 4/3 X l.I pc (see Paper I) and 
< m >= 0.6Mq, we get an estimate of trh ~ 130 Myr, much 
larger than the age of the cluster. This discrepancy between 
relaxation time and age is common to many clusters and 
has been used as an argument in favor of the primordial 
segregation scenario l|Bonnell fc Davie3ll998h . Nevertheless 
one has to consider that the time for a star with mass m« 
to drift towards the cluster centre due to dynamical friction 



< m > 

tdf = X trh 



In the case of Wd 1 this segregation time can be as short 
as 2% of tih, i.e. 2.6 Myr for a star of ~ 30Mq, i.e. the 
most massive stars in our sample. Hence primordial seg- 
regation would not be necessary to explain the observed 
mass segregation. Furthermore the evolution of Wd 1 has 
probably been more complex, and the value of might 
have changed in time. Mass loss from stellar winds, su- 
pernovae explosions and gas removal might have caused a 
global expa nsion of Wd 1, me a ning that trh was sh orter 



in the past iGieles et al] ll2010bl '). ICiirkan et al] l|2004l ') and 
iPortegies Zwart et al. I (|2004l ) showed that the core collapse 
time for massive clusters is about 0.1-0.2 t^h- It is therefore 



not unlikely that the core of Wd 1 has undergone a dynam- 
ical collapse, which is then also followed by expansion that 
could increase the relaxation time. This would push down 
the ma ss limit which we expect to be affected by mass segre- 
gation. [McMillan. Vesperini fc Portegies Zward (|2007l ') have 
proposed an alternative scenario to the primordial segrega- 
tion one, in order to explain mass segregation observed in 
young cluster. This scenario predicts that mass-segregated 
young, massive clusters could be the product of merging 
of several smaller subclusters. Substructure in molecular 
clouds is observed in both density and kinematics (IWilhamj 
1 19991 : 1 Williams et al.|[200ol ) and this substructure is reflected 
as well in young clusters ([L arson 199_5|; iTesti et al] l2000l : 
ICutermuth et al.ll2005l : lAUen et al. 20071 '). In a hierarchical 
formation scenario, a massive cluster could be formed by the 
merging of several subclusters and still might show mass seg- 
regation. Given their smaller N, the subclusters can rapidly 
reach a mass segregated status before they merge and re- 
gardless of the initial spatial distribution of stars. The mass 
segregation in t hese smaller c l uster s is fa vored by shorter 
trh- In addition. [Allison et al] (|2009l . l2O10l ) show that clus- 
ters may undergo an initial collapse phase which can signifi- 
cantly accelerate mass segregation. In this phase the cluster 
forms a very dense and clumpy core, where the massive stars 
can rapidly segregate given the short crossing time and large 
number of encounters. The timescale for such a process in 
a cluster with N ^ 10^ is < IMyr. The si mulations by 
[McMillan. Vesperini fc Portegies Zwarj (|2007[ ') additionally 
show that mass segregation developed by single subclusters 
is preserved during merging. Consequently the final massive 
cluster exhibits mass segregation at an age much smaller 
than its global relaxation time. While merging might not be 
required to explain mass segregation, it could also explain 
the elongated shape of Wd 1. Hence we think that this is a 
very interesting scenario for Wd 1 formation. We point out 
that Mc Millan et al.l (l2007ll have carried out pure N-body 
simulations. Recently, iBatd ((2009 ) has performed hydrody- 
namic simulation of star-forming regions that include gas 
drag and gas accretion onto stars, in addition to the mutual 
gravity between them. The final cluster that is formed is 
the result of the merging of 5 subclusters. The author finds 
no evidence for mass segregation. However the number of 
stars formed in his simulations is of the order of 10"^ with 
the most massive star of only ~ 5Mq. Hence the simulated 
cluster cannot be directly compar ed to Wd 1. More recently, 
using the final stellar positions of [Batel ([20091). and applying 
their own segregation detection method. iMoeckel fc IBonnelll 
(|2009l ) found evidence of segregation at least for the most 
massive stars. This last scenario, in which only the few most 
massive members are found in the cluster's core, is more 
similar to what is observed for the Trapezium stars in the 
ONC, than to what we observe in Wd 1, i.e. an evidence of 
continuous mass segregation across the whole stellar mass 
spectrum. 



10 CONCLUSIONS 

We have presented a new, thorough analysis of near-infrared 
data for the intermediate- and low-mass stellar population 
of the massive young cluster Westerlund 1. Using artificial 
stars, we have been able to sample spatial variations of pho- 
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Figure 11. Effective cumulative radius for Wd 1 stars. Note that 
the ~ 10 rightmost points are affected by the off-centre location 
of the one most massive star in our data. 



tometric completeness on a scale of few stellar FWHM . The 
same artificial stars have been used to infer realistic photo- 
metric error estimates, as well as the correlation between 
errors in different bands. Incompleteness corrections and er- 
rors were used to apply a novel statistical field subtraction 
technique to the data. Using a nearby control field we ob- 
tained a clean sample of cluster members. The clean cata- 
logue of stars, together with state-of-the-art stellar models, 
has been used to determine the cluster's properties. We de- 
rived an extinction Ako, = 0.91±0.05 mag, an age r = 4±0.5 
Myr and a distance d = 4.0 ± 0.2 kpc. 

We investigated the cluster's IMF slope using a new 
approach to stellar mass determination. The information 
on magnitude errors and their correlation has been used 
to derive the mass-probability-distribution for each star, 
given the best-fit isochrone. The completeness-corrected 
IMF has a slope of 7 = 2.441°' pg, slightly steeper than the 
Salpeter or Kroupa IMF; this slight discrepancy could be 
partially reconciled if we consider that, for the sake of sim- 
plicity, we are neglecting the infiuence of (unknown) un- 
detected binarity, hence our quoted error is probably an 
underestimate of the total, statistical plus systematic error 
l|Mafz Apellanij|2009l '). From the IMF slope and its normal- 
ization constant we found a total mass for the cluster of 
Mwdi =4.9llJ;I^ X 10*Mq. 

The spatially varying completeness, combined with the 
probabilistic mass determination, enabled us to investigate 
the spatial variations of the IMF. The Wd 1 starburst clus- 
ter is mass segregated, with massive stars more centrally 
concentrated. Other indications of mass segregation come 
from the analysis of the stellar density distribution. In order 
to study the 2D density distribution as a function of stellar 
mass, we fitted 2D elliptical profiles. This analisys revealed 
a tight dependency of the ellipses semi-major axis length on 
mass:a(m) oc l/log(m). Given the young age of Wd 1, its 
global mass segregation cannot be explained by simple 2- 
body relaxation. Interestingly, from the density distribution 
analysis, we found that Wd 1 is elongated along the Galactic 



Plane with an axis ratio a ; 6 = 3 ; 2. The mass segregation 
and the elongation together hint at a formation scenario in- 
volving the merging of multiple subclusters formed almost 
coevally in the parental giant molecular cloud. 
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APPENDIX A: COMPLETENESS MAPS 

In this appendix we will illustrate, step-by-step, how the 2D 
completeness maps for Wd 1 were obtainetfl. 

Al Adding and detecting artificial stars 

The basic idea is to use the same PSF that was obtained 
by PSF-fitting with DAOPHOT to add stars (using the 
DAOPHOT addstar task) in the reduced images and then 
run the same PSF fitting photometry scheme to see whether 
artificial stars can be recovered or not. 50 stars per run were 
added, in order not to change the crowding characteristic of 
the frame. Stars are positioned randomly on the frame and 
have an uniform distribution in magnitude. To achieve suf- 
ficient spatial resolution we iterate the procedure until we 
have added 4500 stars per unit magnitude. The actual size of 
the magnitude bin used is 0.5 mag, and we consider that ar- 
tificial stars are recovered if the detection yields a difference 
between the input and output magnitude of less than 0.5 
mag. The ranges of magnitude spanned by the simulations 
are those typical of the Wd 1 member stars detected in the 
NTT/Sofl observations, above the linear-regime limit and 
below the detection threshold i.e. J e [11.35, 19.35] mag and 
Ks £ [9.8, 17.8] mag; these magnitude ranges correspond to 
masses between 0.3 and 30 Mq, the exact values depending 

The corresponding IDL procedures will be made available by 
the authors upon request. 
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on the age, distance and extinction values (see Sect.[7l). The 
full magnitude ranges are divided in 16 bins, 0.5 mag wide, 
per each of the two bands. 

With an effective detector area of A^s = Lx x Ly = 
876 X 920 pixeP, resulting from the area in common between 
the observations in the different filters, the numbers we just 
quoted correspond to a typical separation between simulated 
stars (within the 0.5 mag bin) of: 



< d > = 



^cff 



10.8 pixel 



(Al) 



where Ni,in = 2225 is the number of simulated stars in the 
0.5 mag- wide bin. Given the Soft plate scale of O'.'29/pixel 
and given the typical seeing of « 0'.'8 it means that we sam- 
ple the whole frame on a scale which is about 3.5 times the 
PSF Full- Width at Half Maximum (FWHM). The effective 
sampling scale is a bit larger, due to the use of a certain 
number of nearest neighbours to calculate the local value 
of completeness at the position of each simulated star. The 
natural limit, i.e. the minimum length-scale at which com- 
pleteness can be sampled by our method, is the FWHM of 
the PSF itself, which characterizes the ability to distinguish 
two different point sources. The resolution of the incomplete- 
ness map could not be improved further, even if the number 
of simulated stars would be increased in order to achieve a 
spatial sampling smaller than the PSF-FWHM. Our choice 
of the total number of stars is a compromise between a short 
sampling scale and a reasonable number of simulations. 

A2 Building the 2D maps 

Each simulated star can be either recovered or not by 
DAOPHOT PSF fitting, meaning that for that specific star 
completeness is either or 1. Starting from this series of 
sparsely sampled 0-s and 1-s, several steps are necessary to 
obtain a smooth function, which is determined at each point 
on the frame. In the following we will indicate the position 
of simulated stars with a hat symbol, (£, y), while the coor- 
dinate grid on which we actually calculated the function will 
be simply {x,y), which corresponds to the pixels grid of the 
chip. We will refer in this section only to a single magnitude 
bin and to a single photometric band; interpolation in the 
magnitude dimension will be treated in Sect. IA3l 

The first step is to create average completeness values 
at each {xi,yi) for i = 1, Nbin. This is accomplished by con- 
sidering a certain number v of nearest neighbours to the i-th 
simulated star and defining the completeness fraction as: 

„ ,^ recovered stars , , 

Co{x„y^) = ; (A2) 

where the recovered stars are counted among the u neigh- 
bours of the i-th one, which is also included, hence the +1 
in the denominator. The actual value of v is somewhat ar- 
bitrary and has to satisfy two opposite requirements; the 
higher it is, the less the completeness values will be affected 
by statistical noise. On the other hand, a too large value 
would imply a loss in spatial resolution for our complete- 
ness maps. As mentioned in Sect. lAll the effective sampling 
scale is not the < d > of equation (lAlf) but more precisely: 



which degrades our completeness sampling scale by a factor 
4, giving dcff ~ 43.2 pixels, corresponding to about 14 times 
the FWHM of the image PSF. 

At this stage the Co is known only point-wise in the 
set of {x, y) positions occupied by the simulated stars. The 
next step is to interpolate this function into a regular 
grid of points. This is accomplished via the IDL procedure 
GRIDDATA, using the Kriging method of interpolation with 
an exponentially decreasing model for the variogram. Krig- 
ing allows to interpolate a random field known in a set of 
positions into another set, under some assumptions about its 
covariance. In our case the random field is the completeness 
itself, with its Poissonian error due to the finite number of 
simulated stars considered in equation (|A2|) . An exponential 
model for the covariance is appropriate here, because the es- 
timated values of Co at some location (xi,yi) are correlated 
with those for other stars and the correlation is stronger for 
closer simulated stars than for those separated by a large 
distance. We have chosen an e- folding scale equal to d^s- 

After the interpolation we performed a smoothing of 
the completeness. The grid used for the interpolation is in- 
deed finer than d^s, meaning that the interpolated function 
may show artificial variations on a scale smaller than our 
minimum size, which would be unrealistic. That is why we 
additionally smoothed the maps with a boxcar kernel with 
a size of dcff. The boxcar model is appropriate given the 
uniform spatial distribution of simulated stars. 



A3 Interpolation in magnitude 

Once the maps are available in magnitude layers, we en- 
forced that at each location {x,y), completeness is a de- 
creasing function of magnitude. We fitted pixel-by-pixel a 
monotonically decreasing function of Fermi-like type: 



Cj{x,y,p-) 



a{x,y) 



(A3) 



dcfi =< d > . 

After several experiments we decided to use v 



16, 



The meaning of the three coefficients is the following: 

a is the normalization and is always ^ 1; 

/3 is the magnitude for which completeness is a/2; 

7 represents the rapidity with which Cj drops down. 

Once the (a, /?, 7) coefficients triplets are calculated, it is 
straightforward to assign to each real star its completeness 
value using the coefficients evaluated at the star's position. 



A4 Completeness for the control field 

The offset field that we used for field decontamination of 
the colour-magnitude diagram (CMD) is also affected by in- 
completeness. In this case, however, it is not necessary to 
investigate the 2D structure of the completeness pattern; 
under the assumption that the spatial distribution of the 
stars in the control field is uniform, we only consider spa- 
tially uniform incompleteness correction. 

When using a control field for decontamination of a star 
cluster's CMD, one implicitly assumes that the stars are, on 
average, representative of the foreground/background pop- 
ulation in the cluster's field. This assumption has a series of 
shortcomings. For example, the copious cluster population 
itself may partially "shield" background stars. In addition 
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to that, in the Galactic Plane, variable extinction may cause 
differences in the observed population of stars. Furthermore, 
the population along different lines of sight could be intrin- 
sically different, due to the different galactocentric distances 
sampled at the same heliocentric distance or changes within 
the spiral arms. This two latter problems are reduced by 
choosing nearby fields, so that the fore/background popula- 
tions show similar distributions in age and extinction - hence 
in magnitude and colour - as along the cluster's line of sight. 
Hence the choice of the control field is done in order to have 
a population that on average looks like the contaminating 
population in the cluster frame. 

For these reasons it is not necessary to try the same 
2D approach to assign completeness values to the off-cluster 
frame stars. We only populated the whole frame in a uniform 
way with 250 stars per each 0.5 magnitude-wide bin; only 
50 stars were added in each run, not to alter the crowding 
characteristics of the field. Then we computed the fraction 
of recovered stars over the total number of simulated ones 
and fit a function like that of equation (|A3|I , this time with- 
out any spatial dependence. Last step was to assign also 
to the single stars in the control field their corresponding 
completeness value in each photometric band. 



APPENDIX B: EVALUATION OF THE 
PHOTOMETRIC ERRORS AND THEIR 
CORRELATIONS 

In Paper I we used simulated stars to estimate the photo- 
metric errors. We showed that for stars with known input 
magnitudes the output magnitudes were often in disagree- 
ment at a level of more than la, where the the DAOPHOT 
fitting errors were taken as a values. Hence, the difference 
between input and output magnitude seemed to be a more 
conservative and robust estimate of the real photometric 
error. Simulated stars are used here also to estimate the 
correlation between magnitude errors. The photon counts 
associated to an isolated star in two different bands are un- 
correlated. In reality, even though the photon counts are in- 
dependent, the inferred magnitude values may not be. The 
reason why J and Ks magnitude errors are correlated is the 
presence of bright stars or, more generally, crowding. When 
a faint star is located close to a bright star, the residuals 
of PSF fitting procedure of the bright star (which is usually 
bright in both bands) may lead to magnitude errors in both 
bands. If the bright star's wings are not properly subtracted, 
then there will be an excess in the flux that is assigned to the 
nearby faint star. The bright star's wings may also be over- 
subtracted (e.g. because the core is not well fltted), leading 
to too small flux estimates. This can lead to a correlation of 
the photometric errors. Crowding from stars of comparable 
magnitudes will lead to a similar behaviour. 

That J and Ks magnitude errors are correlated is 
obvious from Fig. IBll In the left panel we show J""' — 
J™ vs. K°^^ — K^" for the simulated stars. In the right panel 
we show (J - i^s)""' -{J- KsY" vs. K"""' - K'" . Since the 
two magnitude estimates are correlated, the composed quan- 
tity J — Ks is also correlated to the single magnitude values. 
The coefflcient r in the flgures is the Pearson's correlation 
coefficient for the whole sample of simulated stars, i.e. 
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Figure Bl. Left: correlation between J and Kg magnitude er- 
rors for the simulated stars. Right: same, but using (J — K^) 
colour instead of J magnitude. The r quantities are the values of 
Pearson's correlation coefficients for the two samples. 

E,(x,-X)(y.-y) 

where X and Y are the respective abscissa and ordinate in 
the two plots. From its definition it is clear that Pearson's r 
is equivalent to the X and Y covariance divided by the prod- 
uct of X and Y standard deviations. A value of r very close 
to -1-1 (-1) indicates a very tight correlation (anticorrelation) 
between the two random variables, while two totally uncor- 
related random variables would show a value of r = 0. The 
values of the Pearson's coefficients in Fig. lBll indicate a quite 
significative correlation of the magnitude errors as well as an 
even tighter anticorrelation between magnitude and colour 
errors. Given that r is not a robust, outlier-resistant quan- 
tity, the actual values were calculated removing the outliers, 
i.e. stars for which input and output magnitudes differ more 
than 1 mag in at least one band . The number of stars inside 
this limits is 97 % of the total number of simulated stars. 
Therefore, the exclusion of the outliers does not represent a 
shortcoming in the evaluation of a robust estimate for the 
overall correlation coefficient of the sample. 

Bl Assigning proper photometric errors and their 
correlation to each detected star 

For each detected star we selected a number of at least 7 
simulated stars (using the same stars of Sect. lAl) that were 
positioned in its neighbourhood. By neighbours we mean sim- 
ulated stars whose distance from the position of the real star 
was not larger than 50 pixels and whose magnitude differs 
no more than 1 magnitude - in each band - from the real 
star. The distance of 50 pixels represents the average radius- 
of-tnfluence of the bright stars, i.e. the typical extent of their 
halos, as derived by the analysis of Wd 1 NTT/SofI images. 
For each of the neighbours in the subsample we calculated 
jout _ Jin and/i"""* - K^". The standard deviations of the 
two quantities, within the subsample, have been used as es- 
timates for the photometric errors of the real stars. We also 
calculated the Pearson coefficient between the two quantities 
in the neighbours subsample and assigned it to the detected 
star. The comparison between DAOPHOT errors and our 
newly estimated errors for the Wd 1 field are shown in Fig. 
IB2I The new error estimates are, on average, larger than 
what predicted by DAOPHOT, especially in the Ks band. 
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Table CI. Detections in the on and off frames. 



Field 



Number of stars 



on 7036 

off 5381 

on (after subtraction) 5810 it 25 

on (after subtraction and cr-clipping) 4300 ± 23 



0,n OA 0.2 0.3 0,4 0.5 0,0 O.l 0.2 0.3 0.4 0.5 

DAOPHOT J band errors [mag] DAOPHOT K, band errors [mag] 

Figure B2. Comparison between DAOPHOT photometry errors 
and our newly derived errors for J band (left) and Ks band in the 
case of Wd 1 field. The dashed lines are 1:1, 2;1 and 3;1 relations 
between new and DAOPHOT errors. 



Some of the real stars do not have a sufficient number 
of neighbours to perform this kind of estimate. This is true 
especially for faint stars, since the majority of the simulated 
stars at the faintest magnitudes cannot be recovered. Hence 
these simulated stars cannot be used for the error estimate, 
because they don't have a M°"* value. For stars without 
enough useful simulated neighbours, we used a different er- 
ror estimate. We first divided the real stars for which the 
error determination worked fine in several magnitude bins. 
Then we calculated the mean error per each magnitude bin 
and fitted an exponential relation to the mean error vs. 
bin-magnitude points. This relation has been used to as- 
sign their errors to the stars that lack a sufficient number 
of neighbours. The new errors, as a function of the stars' 
magnitudes are shown in Fig. [S] The figure shows the ex- 
ponential extrapolation used to determine the errors of the 
faintest stars. Together with the new errors for the cluster's 
field stars, we show, in red, the new errors for the control 
field stars, whose derivation is illustrated in Sect. IB2I 



B2 Photometric errors for the control field 

A similar method was used to derive new photometric errors 
for the control field stars. Since we assume that these stars 
are uniformly distributed, there is no need to treat the spa- 
tial variation of the errors. Using the same simulated stars 
as in Sect. El we computed the AMj(i) = M°"*(i) -Mj"(i) 
for Mj = J, Ks and i running over the simulated stars; then 
we binned the stars in 0.5 mag wide bins (in the input mag- 
nitudes) and for each bin we computed the standard devi- 
ation of the AMj over the bin. The last step was to fit the 
(a[AMj], Mj[bin]) points with an exponential relation; here 
Af,[bin] is the central magnitude of the bin. This relation 
was used to assign an error to the real stars as a function of 
magnitude. The average value of the correlation between J 
and Ks was calculated for the whole sample and is r = 0.25. 
This value was assigned to each real star in the control field. 

Figure [3] shows that, on average, the photometric errors 
in the control field are smaller than those in the cluster's 
field. This behaviour is expected, and can be explained by 
the higher degree of crowding for Wd I's field. For the same 
reason, the detection limit for the control field is ~ 0.5 mag 
fainter than the Wd 1 field in both photometric bands. 



APPENDIX C: a - CLIPPING 

Because of some dissimilarities between the on- and off-field 
fore/background populations, the CMD of Wd 1, even after 
subtraction, does not look perfectly clean. For this reason, 
after having chosen the best fitting isochrone, i.e. the 4 Myr 
one, before any further analysis, we additionally subtracted 
those stars that lie more than 3cr away from the reference 
isochrone in the magnitude-magnitude space (see the end of 
Sect.[Sl). After clipping, essentially all stars with colours and 
magnitudes consistent with the 4 Myr cluster population are 
included in the final source selection. Our clipping criteria 
may retain some arbitrariness; nevertheless they do not af- 
fect our further analysis. The main reason is the cut only 
affects the faint stars, with large photometric errors. Some 
of them could be excluded or included in the catalogues by 
slightly changing the a threshold. Anyway, in the compu- 
tation of the IMF (see Sect. [HJ and of the stellar density 
(see Sect.[9|, we only consider stars above a given complete- 
ness or mass threshold. Stars with uncertain membership 
are mostly excluded by these two additional cuts hence they 
do not affect the final results. 

One realization of the clean cluster's CMD is shown in 
Fig. [5] together with the best fitting isochrone. The error 
bars shown in the diagram are the average J — Ks and Ks 
errors per magnitude bin. The colour errors are calculated 
for each star as: 

5{J - Ks) = ^/a^{J) + a^Ks) + 2rjKs^{J)G{Ks) (CI) 

Pearson's r is equal to the covariance divided by the product 
of the two standard deviations (see eauation lBl|) : hence the 
third addend on the r.h.s. of equation (|Cip is equal two twice 
the covariance of J and i^s- 

In Table [CTI we summarize the number of stars left after 
field subtraction and additional clipping. The mean values 
and their uncertainties are derived by iterating the proba- 
bilistic subtraction technique. We repeated the extraction 
of C, for each star, to generate 100 different catalogues (see 
Sect. [5]). We then calculated mean and standard deviations 
of the number of members over the 100 samples. 



APPENDIX D: BOOTSTRAP ESTIMATE OF 
THE IMF PARAMETERS AND THEIR ERRORS 

Bootstrappin g is a resampling technique for error estima - 
tion (see e.g. lEfronlll979l : iHastie et al.l l 20091 : lAndra3l2O10l ). 

Given a data set from which some parameters are estimated, 
bootstrapping consists in resampling the data to create al- 
ternative data sets. From these, it is possible to repeatedly 
estimate the parameters of interest, monitoring their dis- 
tribution. We generated 10^ bootstrap samples to probe 
the parameter space of (7,^4), assuming for the IMF the 
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functional form 



diV(m) 



, — A x m . From our data set, 

dm ' 

we created 100 different realizations of the members cat- 
alogue. Each catalogue has a slightly different number of 
members Nc,i withj = 1, 100, after statistical field sub- 
traction and (T-clipping (see Appendix [C|. From the mem- 
bers of each j-th catalogue 1000 bootstrap samples were 
created. The new samples consist of the same number of 



made with replacement, i.e., the same star can occur mul- 
tiple times in a bootstrap sample. This sample of stars 
is then used to build the IMF as in equation (|4]) where 
now i runs on the stars of the specific bootstrap sample. 
At each iteration a power-law fit is performed to obtain 
a couple (7j,fc, Aj_fe) with j = 1,100 and fe = 1,1000. As al- 
ready detailed in Sect l8.2l the fitting interval is restricted 
to m £ [3.5, 27] Mq. Given the (7j,fc,^j,fc) values, we ob- 
tained the corresponding total mass, Mj^k, and total number 
of stars, Nj^]t, by integrating the power law in the interval 
m e [0.08, 12OM0]. 

A 2D density plot of the output values (7, A) is shown 
in Fig. IDll It is clear that the 7 and A parameter are tightly 
correlated. This is easy to understand. For each bootstrap 
sample we have a number A'fit of stars that are actually in- 
side the fitting interval. Given the different catalogue real- 
izations, this number can be slightly different, but is mostly 
in the interval [1250, 1500]. The IMF fit has to satisfy the 
condition: 



Nat ^ Ax 



m "'dm 



From which we get: 



A : 



A^flt(l-7) 



271 



3.51 



This relation between A and 7 is overplotted in Fig. IDll 
for Nat = 1000, 1250, 1500, 1750 (dotted lines). Given that 
the 2D distribution of (7, A) pairs is clearly non Gaussian, 
the definition of the best values and the confidence intervals 
for the two parameters is not straightforward. The maxi- 
mum of the 2D distribution is located at (7max,Amax) = 
(2.46,1.31 X 10*). With this pair of values we obtain a 
total mass, Mmax = 5.13 x 10* Mq and a total number 
Nraax = 1.10 X 10^ stars. 

On the other hand, using the 2D joint distribution is not 
the most suitable choice for defining the best values and con- 
fidence interval for the parameters (7, A) and for Mtot and 
A'tot. For this purpose, in the case of 7 and A, we used the 
marginal distributions. These are obtained by integration of 
the joint distribution with respect to the other variable. For 
Mtot and A'tot we similarly used the distributions of Mj,fe, 
and Nj^k obtained after each bootstrap iteration. The best 
values are obtained by maximizing the distributions. The 
confidence intervals are obtained by integrating the distri- 
butions from left and right until 16% of the total area under 
the distribution is reached on each side. This means that the 
limits of the asymmetric confidence interval comprise 68% of 
the total area under the distribution function. The marginal 
distributions for 7 and A, as well as the distributions of 
Mj^k, and Nj^k are shown in Fig. ID2I The best values and 
the confidence intervals are given in Table iDl] 




Figure Dl. Density of the occurrencies of (7, A) from our boot- 
srap procedure. The black dot indicates the location of the max- 
imum of the 2D density: (7m,Am) = (2.46,1.31 X 10*). Dotted 
lines are lines of constant number of stars in the fitting interval; 
from bottom to top N^t = 1000, 1250, 1500, 1750. 



Table Dl. Best values and their confidence intervals for the IMF 
parameters, the total mass and the total number of stars of Wd 1.. 



Quantity 



Best value Lower limit Upper limit 



7 

A/10* 

Mtot [10* Mq] 

A'tot /lO* 



2.44 


2.36 


2.64 


1.22 


1.08 


1.78 


4.91 


4.42 


6.70 


10.4 


8.6 


16.4 




Figure D2. Prom left to right: Marginal distributions for 7 and A and distributions of Mjj^, and Nj i^. The best values are marked by 
dotted lines; the confidence intervals are marked by dashed lines. 



